MPF: A portable message passing facility for shared memory multiprocessors

Author: Malony Allen D.
Mcguire Patrick J.
Reed Daniel A.

The design, implementation, and performance evaluation of a message passing facility (MPF) for shared memory multiprocessors are presented. The MPF is based on a message passing model conceptually similar to conversations. Participants (parallel processors) can enter or leave a conversation at any time. The message passing primitives for this model are implemented as a portable library of C function calls. The MPF is currently operational on a Sequent Balance 21000, and several parallel applications were developed and tested. Several simple benchmark programs are presented to establish interprocess communication performance for common communication patterns. Finally, performance figures are presented for two parallel applications: linear systems solution and iterative solution of partial differential equations.

NASA Technical Reports Server

Parallel discrete event simulation: A shared memory approach

Author: Malony Allen D.
Mccredie Bradley D.
Reed Daniel A.

With traditional event list techniques, evaluating a detailed discrete event simulation model can often require hours or even days of computation time. Parallel simulation mimics the interacting servers and queues of a real system by assigning each simulated entity to a processor. By eliminating the event list and maintaining only sufficient synchronization to ensure causality, parallel simulation can potentially provide speedups that are linear in the number of processors. A set of shared memory experiments is presented using the Chandy-Misra distributed simulation algorithm to simulate networks of queues. Parameters include queueing network topology and routing probabilities, number of processors, and assignment of network nodes to processors. These experiments show that Chandy-Misra distributed simulation is a questionable alternative to sequential simulation of most queueing network models.

NASA Technical Reports Server

Visualizing Parallel Computer System Performance

Author: Malony Allen D.
Reed Daniel A.

Parallel computer systems are among the most complex of man's creations, making satisfactory performance characterization difficult. Despite this complexity, there are strong, indeed almost irresistible, incentives to quantify parallel system performance using a single metric. The fallacy lies in succumbing to such temptations. A complete performance characterization requires not only an analysis of the system's constituent levels but also both static and dynamic characterizations. Static or average behavior analysis may mask transients that dramatically alter system performance. Although the human visual system is remarkably adept at interpreting and identifying anomalies in false color data, the importance of dynamic, visual scientific data presentation has only recently been recognized. Large, complex parallel systems pose equally vexing performance interpretation problems. Data from hardware and software performance monitors must be presented in ways that emphasize important events while eliding irrelevant details. Design approaches and tools for performance visualization are the subject of this paper.

NASA Technical Reports Server

Performance analysis integration in the Uintah software development cycle

Author: De St Germain John Davison
Malony Allen D.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2003

The increasing complexity of high-performance computing environments and programming methodologies presents challenges for empirical performance evaluation. Evolving parallel and distributed systems require performance technology that can be flexibly configured to observe different events and associated performance data of interest. It must also be possible to integrate performance evaluation techniques with the programming paradigms and software engineering methods. This is particularly important for tracking performance on parallel software projects involving many code teams over many stages of development. This paper describes the integration of the TAU and XPARE tools in the Uintah Computational Framework (UCF). Discussed is the use of performance mapping techniques to associate low-level performance data to higher levels of abstraction in UCF and the use of performance regression testing to provide a historical portfolio of the evolution of application performance. A scalability study shows the benefits of integrating performance technology in building large-scale parallel applications.

The University of Utah: J. Willard Marriott Digital Library

05501 Abstracts Collection -- Automatic Performance Analysis

Author: Gerndt Hans Michael
Malony Allen
Miller Barton P.
Nagel Wolfgang
Publication venue: Dagstuhl Seminar Proceedings. 05501 - Automatic Performance Analysis
Publication date: 01/01/2006

From 12.12.05 to 16.12.05, the Dagstuhl Seminar 05501 "Automatic Performance Analysis" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

A 3D Finite-Difference BiCG Iterative Solver with the Fourier-Jacobi Preconditioner for the Anisotropic EIT/EEG Forward Problem

Author: Aleksej Zherdetsky
Alena Prakonina
Allen D. Malony
Sergei Turovets
Vasily Volkov
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2014

The Electrical Impedance Tomography (EIT) and electroencephalography (EEG) forward problems in anisotropic inhomogeneous media like the human head belong to the class of three-dimensional boundary value problems for elliptic equations with mixed derivatives. We introduce and explore the performance of several promising new numerical techniques, which seem to be more suitable for solving these problems. The proposed numerical schemes combine the fictitious domain approach with the finite-difference method and the optimally preconditioned Conjugate Gradient- (CG-) type iterative method for treatment of the discrete model. The numerical scheme includes the standard operations of summation and multiplication of sparse matrices and vectors, as well as FFT, making it easy to implement and well suited to effective parallel implementation. Some typical use cases for the EIT/EEG problems are considered, demonstrating the high efficiency of the proposed numerical technique.

Crossref

Directory of Open Access Journals

PubMed Central

Kernel-Level Measurement for Integrated Parallel Performance Views: the KTAU Project

Author: Alan Morris
Allen D. Malony
Aroon Nataraj
Sameer Shende
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

The effect of the operating system on application performance is an increasingly important consideration in high performance computing. OS kernel measurement is key to understanding the performance influences and the interrelationship of system and user-level performance factors. The KTAU (Kernel TAU) methodology and Linux-based framework provides parallel kernel performance measurement from both a kernel-wide and process-centric perspective. The first characterizes overall aggregate kernel performance for the entire system. The second characterizes kernel performance when it runs in the context of a particular process. KTAU extends the TAU performance system with kernel-level monitoring, while leveraging TAU’s measurement and analysis capabilities. We explain the rationale and motivations behind our approach, describe the KTAU design and implementation, and show working examples on multiple platforms demonstrating the versatility of KTAU in integrated system / application monitoring.

CiteSeerX

Crossref

Bacatá: A Language Parametric Notebook Generator (Tool Demo)

Author: Fowler Martin
Heering Jan
Klint Paul
Kluyver Thomas
Knuth Donald E.
Malony Allen D.
Sametinger Johannes
Turner Phil
Publication date: 01/01/2018

Interactive notebooks allow people to communicate and collaborate through a single rich document that might include live code, multimedia, computed results, and documentation, which is persisted as a whole for reproducibility. Notebooks are currently being used extensively in domains such as data science, data journalism, and machine learning. However, constructing a notebook interface for a new language requires a lot of effort. In this tool paper, we present Bacatá, a language parametric notebook generator for domain-specific languages (DSL) based on the Jupyter framework. Bacatá is designed so that language engineers may reuse existing language components (such as parsers, code generators, interpreters, etc.) as much as possible. Moreover, we explain the design of Bacatá and how DSL notebooks can be generated with minimum effort in the context of the Rascal meta programming system and language workbench.

Crossref

Repository TU/e

CWI's Institutional Repository

Pure OAI Repository

Performance Evaluation of Adaptive Scientific Applications using TAU

Author: Alan Morris
Allen D. Malony
J. Davison De St. Germain
Sameer Shende
Steven Parker
Publication date: 02/04/2008

Fueled by increasing processor speeds and high speed interconnection networks, advances in high performance computer architectures have allowed the development of increasingly complex large scale parallel systems. For computational scientists, programming these systems efficiently is a challenging task. Understanding the performance of their parallel applications i

CiteSeerX